AITopics | hidden size

Continual Learning with Query-Only Attention

Bekal, Gautham, Pujari, Ashish, Kelly, Scott David

arXiv.org Artificial Intelligence, Nov-4-2025

Continual learning involves learning from a stream of data without repetition of data points, a scenario that is inherently complex due to distributional shift across tasks. We propose a query-only attention mechanism that discards keys and values, yet preserves the core inductive bias of transformer architectures. In continual learning scenarios, this simplified mechanism significantly mitigates both loss of plasticity and catastrophic forgetting, outperforming baselines such as selective re-initialization. We establish a conceptual link between query-only attention, full transformer attention, and model-agnostic meta-learning, framing them as instances of meta-learning. We further provide intuition for why query-based models and attention networks help preserve plasticity in continual settings. Finally, through preliminary Hessian spectrum analysis, we observe that models maintaining higher curvature rank across tasks tend to retain plasticity. Our findings suggest that full attention may not be essential for capturing the benefits of meta-learning in continual learning.

artificial intelligence, machine learning, plasticity, (16 more...)

arXiv: 2510.00365

Genre: Research Report > New Finding (0.86)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Robust Mixture Models for Algorithmic Fairness Under Latent Heterogeneity

Li, Siqi, Liu, Molei, Tian, Ziye, Hong, Chuan, Liu, Nan

arXiv.org Machine Learning, Sep-23-2025

Standard machine learning models optimized for average performance often fail on minority subgroups and lack robustness to distribution shifts. This challenge worsens when subgroups are latent and affected by complex interactions among continuous and discrete features. We introduce ROME (RObust Mixture Ensemble), a framework that learns latent group structure from data while optimizing for worst-group performance. ROME employs two approaches: an Expectation-Maximization algorithm for linear models and a neural Mixture-of-Experts for nonlinear settings. Through simulations and experiments on real-world datasets, we demonstrate that ROME significantly improves algorithmic fairness compared to standard methods while maintaining competitive average performance. Importantly, our method requires no predefined group labels, making it practical when sources of disparities are unknown or evolving.
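
As a rough illustration of the worst-group objective mentioned in this abstract (not ROME's actual implementation), the sketch below computes the average loss and the worst per-group mean loss from per-sample losses and group assignments. The function name `worst_group_loss` and the toy data are hypothetical. In ROME the groups are latent and learned from data, whereas here they are simply given.

```python
from collections import defaultdict


def worst_group_loss(losses, groups):
    """Return (average loss, worst per-group mean loss, id of the worst group).

    Illustrative only: assumes group assignments are already known.
    """
    if not losses or len(losses) != len(groups):
        raise ValueError("losses and groups must be non-empty and of equal length")

    # Collect the losses that belong to each group.
    per_group = defaultdict(list)
    for loss, group in zip(losses, groups):
        per_group[group].append(loss)

    # Mean loss inside each group, then pick the group with the highest mean.
    group_means = {g: sum(v) / len(v) for g, v in per_group.items()}
    worst_id = max(group_means, key=group_means.get)

    average = sum(losses) / len(losses)
    return average, group_means[worst_id], worst_id


# Hypothetical data: a majority group "a" with low loss, a minority group "b" with high loss.
losses = [0.1, 0.2, 0.1, 0.2, 0.9, 1.1]
groups = ["a", "a", "a", "a", "b", "b"]
average, worst, worst_id = worst_group_loss(losses, groups)
print(f"average={average:.3f} worst-group={worst:.3f} (group {worst_id})")
```

The average hides the minority group's much higher loss, which is the gap that a worst-group objective targets.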

dataset, hidden size, worst-group performance, (12 more...)

arXiv: 2509.17411

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)
Europe > France (0.04)

Genre:

Research Report > Experimental Study (0.97)
Research Report > New Finding (0.71)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

b3f61131b6eceeb2b14835fa648a48ff-Supplemental.pdf

Neural Information Processing Systems, Aug-15-2025, 21:51:15 GMT

gradient, hyperparameter, training bnn, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

AutoPV: Automatically Design Your Photovoltaic Power Forecasting Model

Chen, Dayin, Shi, Xiaodan, Jiang, Mingkun, Zhang, Haoran, Zhang, Dongxiao, Chen, Yuntian, Yan, Jinyue

arXiv.org Artificial Intelligence, Aug-1-2024

Photovoltaic power forecasting (PVPF) is a critical area in time series forecasting (TSF), enabling the efficient utilization of solar energy. With advancements in machine learning and deep learning, various models have been applied to PVPF tasks. However, constructing an optimal predictive architecture for specific PVPF tasks remains challenging, as it requires cross-domain knowledge and significant labor costs. To address this challenge, we introduce AutoPV, a novel framework for the automated search and construction of PVPF models based on neural architecture search (NAS) technology. We develop a brand new NAS search space that incorporates various data processing techniques from state-of-the-art (SOTA) TSF models and typical PVPF deep learning models. The effectiveness of AutoPV is evaluated on diverse PVPF tasks using a dataset from the Daqing Photovoltaic Station in China. Experimental results demonstrate that AutoPV can complete the predictive architecture construction process in a relatively short time, and the newly constructed architecture is superior to SOTA predefined models. This work bridges the gap in applying NAS to TSF problems, assisting non-experts and industries in automatically designing effective PVPF models.

architecture, forecasting, search space, (14 more...)

arXiv: 2408.00601

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Asia > China > Heilongjiang Province > Daqing (0.24)
North America > United States > Oklahoma > Cleveland County > Norman (0.14)
(13 more...)

Genre: Research Report > New Finding (0.48)

Industry: Energy > Renewable > Solar (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization

Luo, Yudong, Pan, Yangchen, Wang, Han, Torr, Philip, Poupart, Pascal

arXiv.org Artificial Intelligence, Jun-28-2024

This inefficiency stems from two main facts: a focus on tail-end performance that overlooks many sampled trajectories, and the potential of gradient vanishing when the lower tail of the return distribution is overly flat. To address these challenges, we propose a simple mixture policy parameterization. This method integrates a risk-neutral policy with an adjustable policy to form a risk-averse policy. By employing this strategy, all collected trajectories can be utilized for policy updating, and the issue of vanishing gradients is counteracted by stimulating higher returns through the risk-neutral component, thus lifting the tail and preventing flatness. Our empirical study reveals that this mixture parameterization is uniquely effective across a variety of benchmark domains. Specifically, it excels in identifying risk-averse CVaR policies in some MuJoCo environments where the traditional CVaR-PG fails to learn a reasonable policy.
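
To make the first source of inefficiency concrete, here is a minimal sketch of the empirical CVaR of a batch of returns at level alpha, taken as the mean of the worst alpha-fraction of returns. The function name `empirical_cvar` and the sample returns are hypothetical. This is the standard sample estimator, not the paper's code.

```python
import math


def empirical_cvar(returns, alpha):
    """Return (CVaR estimate, tail returns used) for the lower tail at level alpha.

    The estimate is the mean of the worst ceil(alpha * n) returns.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")

    k = math.ceil(alpha * len(returns))  # number of tail samples
    tail = sorted(returns)[:k]           # the k lowest returns
    return sum(tail) / k, tail


# Hypothetical returns from ten sampled trajectories.
returns = [10.0, -5.0, 3.0, 8.0, -1.0, 7.0, 2.0, 9.0, 4.0, 6.0]
cvar, tail = empirical_cvar(returns, alpha=0.2)
unused = len(returns) - len(tail)
print(f"CVaR_0.2 = {cvar}, tail = {tail}, trajectories not in the estimate = {unused}")
```

Only the tail trajectories enter the estimate, so an update driven purely by this quantity leaves most of the sampled trajectories unused.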

conference paper, cvar-pg, learning rate, (14 more...)

arXiv: 2403.11062

Country:

North America > Canada > Alberta (0.14)
North America > Canada > Ontario (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.82)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Unleash Graph Neural Networks from Heavy Tuning

Lin, Lequan, Shi, Dai, Han, Andi, Wang, Zhiyong, Gao, Junbin

arXiv.org Artificial Intelligence, May-21-2024

Graph Neural Networks (GNNs) are deep-learning architectures designed for graph-type data, where understanding relationships among individual observations is crucial. However, achieving promising GNN performance, especially on unseen data, requires comprehensive hyperparameter tuning and meticulous training. Unfortunately, these processes come with high computational costs and significant human effort. Additionally, conventional searching algorithms such as grid search may result in overfitting on validation data, diminishing generalization accuracy. To tackle these challenges, we propose a graph conditional latent diffusion framework (GNN-Diff) to generate high-performing GNNs directly by learning from checkpoints saved during a light-tuning coarse search. Our method: (1) unleashes GNN training from heavy tuning and complex search space design; (2) produces GNN parameters that outperform those obtained through comprehensive grid search; and (3) establishes higher-quality generation for GNNs compared to diffusion frameworks designed for general neural networks.

accuracy, configuration, neural network, (16 more...)

arXiv: 2405.12521

Country: North America (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

A Comparative Study on Unsupervised Anomaly Detection for Time Series: Experiments and Analysis

Zhao, Yan, Deng, Liwei, Chen, Xuanhao, Guo, Chenjuan, Yang, Bin, Kieu, Tung, Huang, Feiteng, Pedersen, Torben Bach, Zheng, Kai, Jensen, Christian S.

arXiv.org Artificial Intelligence, Sep-10-2022

The continued digitization of societal processes translates into a proliferation of time series data that cover applications such as fraud detection, intrusion detection, and energy management, where anomaly detection is often essential to enable reliability and safety. Many recent studies target anomaly detection for time series data. Indeed, the area of time series anomaly detection is characterized by diverse data, methods, and evaluation strategies, and comparisons in existing studies consider only part of this diversity, which makes it difficult to select the best method for a particular problem setting. To address this shortcoming, we introduce taxonomies for data, methods, and evaluation strategies, provide a comprehensive overview of unsupervised time series anomaly detection using the taxonomies, and systematically evaluate and compare state-of-the-art traditional as well as deep learning techniques. In the empirical study using nine publicly available datasets, we apply the most commonly used performance evaluation metrics to typical methods under a fair implementation standard. Based on the structuring offered by the taxonomies, we report on empirical studies and provide guidelines, in the form of comparative tables, for choosing the methods most suitable for particular application settings. Finally, we propose research directions for this dynamic field.

data mining, machine learning, roc pr cnn-ae 0, (15 more...)

arXiv: 2209.04635

Country:

North America > United States > New York (0.04)
Asia > China (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)

Genre:

Research Report (0.83)
Overview (0.67)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.67)
Health & Medicine > Diagnostic Medicine (0.67)
Information Technology > Security & Privacy (0.65)
Law Enforcement & Public Safety > Fraud (0.47)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Structured Diversification Emergence via Reinforced Organization Control and Hierarchical Consensus Learning

Li, Wenhao, Wang, Xiangfeng, Jin, Bo, Sheng, Junjie, Hua, Yun, Zha, Hongyuan

arXiv.org Artificial Intelligence, Feb-9-2021

When solving a complex task, humans will spontaneously form teams to complete different parts of the whole task. Meanwhile, the cooperation between teammates will improve efficiency. However, for current cooperative MARL methods, the cooperation team is constructed through either heuristics or end-to-end blackbox optimization. In order to improve the efficiency of cooperation and exploration, we propose a structured diversification emergence MARL framework named Rochico based on reinforced organization control and hierarchical consensus learning. Rochico first learns an adaptive grouping policy through the organization control module, which is established by independent multi-agent reinforcement learning. Further, the hierarchical consensus module based on the hierarchical intentions with consensus constraint is introduced after team formation. Simultaneously, utilizing the hierarchical consensus module and a self-supervised intrinsic reward enhanced decision module, the proposed cooperative MARL algorithm Rochico can output the final diversified multi-agent cooperative policy. All three modules are organically combined to promote the structured diversification emergence. Comparative experiments on four large-scale cooperation tasks show that Rochico is significantly better than the current SOTA algorithms in terms of exploration efficiency and cooperation strength.

agent, algorithm, intention, (12 more...)

arXiv: 2102.04775

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Virginia (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)
